What is new in Stable Diffusion 2.0?

What is new in Stable Diffusion 2.0? It’s trained with a brand new text encoder OpenCLIP and Depth-to-Image diffusion model.
text-to-image
image generation
Published

November 24, 2022

Keywords

image generation, stable diffusion, deep learning,

While the world was already amazed by the performance of open source text to image model, Stable Diffusion 1.x. Stability AI has released a new version with a lot of improvements.

List of notable updates:

  • Trained using a new text encoder, OpenCLIP, developed by LAION with support from Stability AI.
  • The text-to-image models can now generate images with default resolutions of both 512x512 pixels and 768x768 pixels.
  • The models are trained on subset of LAION-5B dataset after filtering out adult content using NSFW filter.
  • Stable Diffusion 2.0 comes with an Upscaler Diffusion model that enhances the resolution of images by a factor of 4.
  • Depth-to-Image: It extends the version 1 image-to-image by also considering depth of an input image for generation of a new image.
  • It also brings a new text guided image inpainting diffusion model, finetuned on the new Stable Diffusion 2.0 base text-to-image.